-
Notifications
You must be signed in to change notification settings - Fork 1.2k
SSZ-QL: calculate generalized indices for elements #15873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SSZ-QL: calculate generalized indices for elements #15873
Conversation
… no recursion. Extended test coverage for bitlist and bitvectors. vectors need more testing
…he beginning. Swap to regex to gain flexibility.
…pe (uint64 instead of uint8)
removed extractFieldName func.
…against vector.length/list.limit
ce17749 to
73e3ee7
Compare
Co-authored-by: Jun Song <[email protected]>
…e with length >= 1 If s does not contain sep and sep is not empty, Split returns a slice of length 1 whose only element is s.
…needed in extractFieldName function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall good, my unofficial approval 👍
- renamed itemLengthFromInfo to itemLength (same name is in spec). - arranged all SSZ helpers.
| } | ||
|
|
||
| // Starting from the root generalized index | ||
| root := uint64(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename it to something like currentIndex? It's a bit odd to call it root since it's not a root of anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
I got the inspiration for the root name from spec:
def get_generalized_index(typ: SSZType, *path: PyUnion[int, SSZVariableName]) -> GeneralizedIndex:
"""
Converts a path (eg. `[7, "foo", 3]` for `x[7].foo[3]`, `[12, "bar", "__len__"]` for
`len(x[12].bar)`) into the generalized index representing its position in the Merkle tree.
"""
root = GeneralizedIndex(1)
(...)But I do not like it because I do not associate it to an index.
encoding/ssz/query/path.go
Outdated
| // e.g. "array[0][1]" -> []uint64{0, 1}. Errors if none are found or if any index is invalid. | ||
| func extractArrayIndices(name string) ([]uint64, error) { | ||
| // Match all bracketed content, then we'll parse as unsigned to catch negatives explicitly | ||
| re := arrayIndexRegex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of assigning to a new variable instead of using arrayIndexRegex directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No purpose at all. I'll remove it.
encoding/ssz/query/path.go
Outdated
| if strings.HasPrefix(raw, "-") { | ||
| return nil, fmt.Errorf("cannot process negative indices %q", raw) | ||
| } | ||
| idx, err := strconv.ParseUint(raw, 10, 64) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("invalid array index: %w", err) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the documentation for ParseUint it seems that it doesn't allow negative numbers, so the HasPrefix check is unnecessary
// ParseUint is like [ParseInt] but for unsigned numbers.
//
// A sign prefix is not permitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Totally unnecessary check.
encoding/ssz/query/path.go
Outdated
| if len(parts) != 2 { | ||
| return nil, fmt.Errorf("invalid index notation in path element %s", elem) | ||
| } | ||
| re := regexp.MustCompile(`^\s*len\s*\(\s*([^)]+?)\s*\)\s*$`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about extracting this regex to a package-level variable, just like you did with arrayIndexRegex?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to extract this one. It's now at the package-level variable.
| } | ||
| re := regexp.MustCompile(`^\s*len\s*\(\s*([^)]+?)\s*\)\s*$`) | ||
| matches := re.FindStringSubmatch(processingField) | ||
| if len(matches) == 2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add an explanation why 2 is expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added explanation. I'm now considering regular expressions an overkill, when I added them I was considering input validation and correction.
| }, | ||
| wantErr: false, | ||
| }, | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add even more test cases? The ones I can think of:
- leading double dot --> error
- trailing dot --> error
len(data)--> errorlen(data.target.root)--> oklen(data.target.root).foo--> errordata.target.len(root)--> error
The easiest way to get a bunch of test cases it to pass the regex to an AI and ask it to generate them. I expect it will overdo this, but some cases can be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding them, but I see we do not follow the same convention for requests:
len(data.target.root)--> errordata.target.len(root)--> ok
As we firstly split the raw input by ., having this input len(data.target.root) would result in:
len(datatargetroot)
leading to a wrong outcome.
On the contrary, this would succeed data.target.len(root)
datatargetlen(root)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(data)--> error
I do not see an error here neither. Imagine we are querying validators field length in beacon state.
In this case the query would contain len(validators)
| // Helpers for Generalized Index calculation per type | ||
|
|
||
| // calculateLengthGeneralizedIndex calculates the generalized index for a length field. | ||
| // note: length fields are only valid for List and Bitlist types. Multi-dimensional arrays are not supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be supported for Vector and Bitvector too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In relation to this, I also followed the spec algo:
for p in path:
# If we descend to a basic type, the path cannot continue further
assert not issubclass(typ, BasicValue)
if p == "__len__":
+ assert issubclass(typ, (List, ByteList))
typ = uint64
root = GeneralizedIndex(root * 2 + 1)To my understanding, there is no length field for Vector and Bitvector as they have fixed size determined by their type.
|
|
||
| // calculateLengthGeneralizedIndex calculates the generalized index for a length field. | ||
| // note: length fields are only valid for List and Bitlist types. Multi-dimensional arrays are not supported. | ||
| func calculateLengthGeneralizedIndex(fieldSsz *SszInfo, element PathElement, root uint64) (*SszInfo, uint64, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly to the comment above (changing root to currentIndex), can you rename the root param to something like parentIndex (or something else more suitable)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-renamed them 😅
changed all of them from root to currentIndex. But now I got to this comment and, for these functions, I like this parentIndex more than currentIndex.
…turns from calculate<Type>GeneralizedIndex functions
Co-authored-by: Radosław Kapka <[email protected]>
Head branch was pushed to by a user without write access
Head branch was pushed to by a user without write access


What type of PR is this?
Feature
Which issues(s) does this PR fix?
Partially #15598
What does this PR do? Why is it needed?
This PR develop a way to calculate the Generalized Indices of a given path within a SSZ Object. To do so, it follows Consensus Spec's Merkle proofs.
A conversion from PathElement to Generalized Index is necessary to work with fastssz proofs library.
The code implemented walks the path within the SSZInfo struct in a consensus layer spec way. We have to take into account that the "path" input is slightly different:
The pythonic version expects a Path input
[FieldA,"FieldB",3]while the Go version expectsfield_a.field_b[3].Other notes for review
Acknowledgements